AITopics | semi-implicit variational inference

9bbb3c3aa33616c55521e2f826c132bd-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 02:51:42 GMT

conditional layer, hsivi-sm, variational inference, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Beijing > Beijing (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

A Kernel Approach for Semi-implicit Variational Inference

Yu, Longlin, Cheng, Ziheng, Zhang, Shiyue, Zhang, Cheng

arXiv.org Machine LearningJan-21-2026

Semi-implicit variational inference (SIVI) enhances the expressiveness of variational families through hierarchical semi-implicit distributions, but the intractability of their densities makes standard ELBO-based optimization biased. Recent score-matching approaches to SIVI (SIVI-SM) address this issue via a minimax formulation, at the expense of an additional lower-level optimization problem. In this paper, we propose kernel semi-implicit variational inference (KSIVI), a principled and tractable alternative that eliminates the lower-level optimization by leveraging kernel methods. We show that when optimizing over a reproducing kernel Hilbert space, the lower-level problem admits an explicit solution, reducing the objective to the kernel Stein discrepancy (KSD). Exploiting the hierarchical structure of semi-implicit distributions, the resulting KSD objective can be efficiently optimized using stochastic gradient methods. We establish optimization guarantees via variance bounds on Monte Carlo gradient estimators and derive statistical generalization bounds of order $\tilde{\mathcal{O}}(1/\sqrt{n})$. We further introduce a multi-layer hierarchical extension that improves expressiveness while preserving tractability. Empirical results on synthetic and real-world Bayesian inference tasks demonstrate the effectiveness of KSIVI.

artificial intelligence, machine learning, variational inference, (15 more...)

arXiv.org Machine Learning

2601.12023

Country: North America > United States (0.92)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

Importance Weighted Hierarchical Variational Inference

Neural Information Processing SystemsDec-25-2025, 10:11:18 GMT

Variational Inference is a powerful tool in the Bayesian modeling toolkit, however, its effectiveness is determined by the expressivity of the utilized variational distributions in terms of their ability to match the true posterior distribution. In turn, the expressivity of the variational family is largely limited by the requirement of having a tractable density function. To overcome this roadblock, we introduce a new family of variational upper bounds on a marginal log-density in the case of hierarchical models (also known as latent variable models). We then derive a family of increasingly tighter variational lower bounds on the otherwise intractable standard evidence lower bound for hierarchical variational distributions, enabling the use of more expressive approximate posteriors. We show that previously known methods, such as Hierarchical Variational Models, Semi-Implicit Variational Inference and Doubly Semi-Implicit Variational Inference can be seen as special cases of the proposed approach, and empirically demonstrate superior performance of the proposed method in a set of experiments.

importance weighted hierarchical variational inference, name change, variational inference, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.61)

Add feedback

Hierarchical Semi-Implicit Variational Inference with Application to Diffusion Model Acceleration Longlin Y u

Neural Information Processing SystemsOct-9-2025, 02:36:32 GMT

Moreover, given pre-trained score networks, HSIVI can be used to accelerate the sampling process of diffusion models with the score matching objective.

artificial intelligence, hsivi-sm, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Beijing > Beijing (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Continuous Semi-Implicit Models

Yu, Longlin, Zha, Jiajun, Yang, Tong, Xie, Tianyu, Zhang, Xiangyu, Chan, S. -H. Gary, Zhang, Cheng

arXiv.org Machine LearningJun-10-2025

Semi-implicit distributions have shown great promise in variational inference and generative modeling. Hierarchical semi-implicit models, which stack multiple semi-implicit layers, enhance the expressiveness of semi-implicit distributions and can be used to accelerate diffusion models given pretrained score networks. However, their sequential training often suffers from slow convergence. In this paper, we introduce CoSIM, a continuous semi-implicit model that extends hierarchical semi-implicit models into a continuous framework. By incorporating a continuous transition kernel, CoSIM enables efficient, simulation-free training. Furthermore, we show that CoSIM achieves consistency with a carefully designed transition kernel, offering a novel approach for multistep distillation of generative models at the distributional level. Extensive experiments on image generation demonstrate that CoSIM performs on par or better than existing diffusion model acceleration methods, achieving superior performance on FD-DINOv2.

artificial intelligence, machine learning, optimization problem, (14 more...)

arXiv.org Machine Learning

2506.06778

Country:

Asia > China > Beijing > Beijing (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Semi-Implicit Variational Inference via Kernelized Path Gradient Descent

Pielok, Tobias, Bischl, Bernd, Rügamer, David

arXiv.org Artificial IntelligenceJun-6-2025

Semi-implicit variational inference (SIVI) is a powerful framework for approximating complex posterior distributions, but training with the Kullback-Leibler (KL) divergence can be challenging due to high variance and bias in high-dimensional settings. While current state-of-the-art semi-implicit variational inference methods, particularly Kernel Semi-Implicit Variational Inference (KSIVI), have been shown to work in high dimensions, training remains moderately expensive. In this work, we propose a kernelized KL divergence estimator that stabilizes training through nonparametric smoothing. To further reduce the bias, we introduce an importance sampling correction. We provide a theoretical connection to the amortized version of the Stein variational gradient descent, which estimates the score gradient via Stein's identity, showing that both methods minimize the same objective, but our semi-implicit approach achieves lower gradient variance. In addition, our method's bias in function space is benign, leading to more stable and efficient optimization. Empirical results demonstrate that our method outperforms or matches state-of-the-art SIVI methods in both performance and training efficiency.

artificial intelligence, machine learning, variational inference, (17 more...)

arXiv.org Artificial Intelligence

2506.05088

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.71)

Add feedback

Importance Weighted Hierarchical Variational Inference

Neural Information Processing SystemsMay-27-2025, 11:04:13 GMT

Variational Inference is a powerful tool in the Bayesian modeling toolkit, however, its effectiveness is determined by the expressivity of the utilized variational distributions in terms of their ability to match the true posterior distribution. In turn, the expressivity of the variational family is largely limited by the requirement of having a tractable density function. To overcome this roadblock, we introduce a new family of variational upper bounds on a marginal log-density in the case of hierarchical models (also known as latent variable models). We then derive a family of increasingly tighter variational lower bounds on the otherwise intractable standard evidence lower bound for hierarchical variational distributions, enabling the use of more expressive approximate posteriors. We show that previously known methods, such as Hierarchical Variational Models, Semi-Implicit Variational Inference and Doubly Semi-Implicit Variational Inference can be seen as special cases of the proposed approach, and empirically demonstrate superior performance of the proposed method in a set of experiments.

importance weighted hierarchical variational inference, semi-implicit variational inference, variational inference, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.65)

Add feedback

Review for NeurIPS paper: Learning to Learn Variational Semantic Memory

Neural Information Processing SystemsJan-25-2025, 06:04:15 GMT

Correctness: As mentioned above, I am a bit skeptical about the technical correctness for the variational inference framework. Specifically, - I think the latent z in Eq.(2) does not properly represent the class prototypes as z is conditioned on each individual x, not a entire class set (But on the other hand, Figure 1 shows that the latent z is conditioned on each of the class sets, and I'm confused which one is right). I don't understand how the approximate posterior q(z S) can have dependency on S, because according to the generative process defined by Eq.(2), the true posterior p(z x,y) does not have the dependency on the entire class set S except for each individual point (x,y). If it is not included, then the inference of m should be based on semi-implicit variational inference [2,3] as the intermediate stochastic variable m is only for the approximate posterior. However, such a discussion has not been discussed in the paper and the ELBO expression Eq.(13) seems not to represent the SIVI procedure as well.

learn variational semantic memory, neurips paper, semi-implicit variational inference, (6 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Consumer Health (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.40)

Add feedback

Importance Weighted Hierarchical Variational Inference

Neural Information Processing SystemsOct-10-2024, 02:00:27 GMT

Variational Inference is a powerful tool in the Bayesian modeling toolkit, however, its effectiveness is determined by the expressivity of the utilized variational distributions in terms of their ability to match the true posterior distribution. In turn, the expressivity of the variational family is largely limited by the requirement of having a tractable density function. To overcome this roadblock, we introduce a new family of variational upper bounds on a marginal log-density in the case of hierarchical models (also known as latent variable models). We then derive a family of increasingly tighter variational lower bounds on the otherwise intractable standard evidence lower bound for hierarchical variational distributions, enabling the use of more expressive approximate posteriors. We show that previously known methods, such as Hierarchical Variational Models, Semi-Implicit Variational Inference and Doubly Semi-Implicit Variational Inference can be seen as special cases of the proposed approach, and empirically demonstrate superior performance of the proposed method in a set of experiments.

importance weighted hierarchical variational inference, semi-implicit variational inference, variational inference, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.65)

Add feedback

Hierarchical Semi-Implicit Variational Inference with Application to Diffusion Model Acceleration

Yu, Longlin, Xie, Tianyu, Zhu, Yu, Yang, Tong, Zhang, Xiangyu, Zhang, Cheng

arXiv.org Artificial IntelligenceOct-26-2023

Semi-implicit variational inference (SIVI) has been introduced to expand the analytical variational families by defining expressive semi-implicit distributions in a hierarchical manner. However, the single-layer architecture commonly used in current SIVI methods can be insufficient when the target posterior has complicated structures. In this paper, we propose hierarchical semi-implicit variational inference, called HSIVI, which generalizes SIVI to allow more expressive multi-layer construction of semi-implicit distributions. By introducing auxiliary distributions that interpolate between a simple base distribution and the target distribution, the conditional layers can be trained by progressively matching these auxiliary distributions one layer after another. Moreover, given pre-trained score networks, HSIVI can be used to accelerate the sampling process of diffusion models with the score matching objective. We show that HSIVI significantly enhances the expressiveness of SIVI on several Bayesian inference problems with complicated target distributions. When used for diffusion model acceleration, we show that HSIVI can produce high quality samples comparable to or better than the existing fast diffusion model based samplers with a small number of function evaluations on various datasets.

artificial intelligence, hsivi-sm, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2310.17153

Country: